An Exploratory Analysis of a Large Health Cohort Study Using Bayesian Networks

Authors

  • Delin Shen
  • Peter Szolovits
  • Martha L. Gray
Abstract

Large health cohort studies are among the most effective ways of studying the causes, treatments, and outcomes of diseases, because they systematically collect a wide range of data over long periods. The wealth of data in such studies may yield important results beyond the already numerous findings, especially when subjected to newer analytical methods. Bayesian Networks (BN) provide a relatively new method of representing uncertain relationships among variables, using the tools of probability and graph theory, and have been widely used to analyze dependencies and the interplay between variables. We used BN to perform an exploratory analysis of a rich collection of data from one large health cohort study, the Nurses' Health Study (NHS), with a focus on breast cancer. We explored the NHS data using BN to look for breast cancer risk factors, including a group of Single Nucleotide Polymorphisms (SNPs). We found no association between the SNPs and breast cancer, but found a dependency between clomid and breast cancer. We evaluated clomid as a potential risk factor after matching on age and number of children. Our results showed that clomid was associated with an increased risk of estrogen receptor positive breast cancer (odds ratio 1.52, 95% CI 1.11-2.09) and a decreased risk of estrogen receptor negative breast cancer (odds ratio 0.46, 95% CI 0.22-0.97). We developed breast cancer risk models using BN. We trained the models on 75% of the data and evaluated them on the remaining 25%. Because of the clinical importance of predicting risks for estrogen receptor positive and progesterone receptor positive breast cancer, we focused on this specific type of breast cancer to predict two-year, four-year, and six-year risks.
The concordance statistics of the prediction results on test sets are 0.70. We also evaluated the calibration performance of the models and applied a filter to the output, using Agglomerative Information Bottleneck clustering, to improve the linear relationship between predicted and observed risks without sacrificing much discrimination performance.
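
For readers unfamiliar with the concordance statistic mentioned in the abstract, the following is a minimal sketch of how such a value can be computed for a binary outcome. It is not the authors' code: the function name `concordance_statistic` and the sample risks and labels are hypothetical, chosen only for illustration. The statistic is the fraction of (case, non-case) pairs in which the case received the higher predicted risk, with ties counted as half.

```python
def concordance_statistic(predicted_risks, outcomes):
    """Return the c-statistic for binary outcomes (1 = case, 0 = non-case).

    Hypothetical helper for illustration; ties in predicted risk count as 0.5.
    """
    cases = [p for p, y in zip(predicted_risks, outcomes) if y == 1]
    non_cases = [p for p, y in zip(predicted_risks, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_risk in cases:
        for non_case_risk in non_cases:
            if case_risk > non_case_risk:
                concordant += 1.0   # case correctly ranked above non-case
            elif case_risk == non_case_risk:
                concordant += 0.5   # tie contributes half a pair
    return concordant / (len(cases) * len(non_cases))


# Made-up predicted risks and outcomes, not data from the study.
risks = [0.10, 0.40, 0.35, 0.80, 0.20, 0.60]
labels = [0, 0, 1, 1, 0, 1]
print(concordance_statistic(risks, labels))
```

A value of 0.5 corresponds to ranking cases and non-cases no better than chance, and 1.0 to perfect ranking. The pairwise loop is quadratic in the sample size, which is acceptable for a sketch but not for a cohort-sized test set.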

Similar resources

The modeling of body's immune system using Bayesian Networks

In this paper, urinary infection, which is a common symptom of the decline of the immune system, is discussed based on well-known machine learning algorithms, such as Bayesian networks in both Markov and tree structures. Large-scale sampling was carried out to evaluate the performance of the Bayesian network algorithm. A total of 4052 samples were obtained from the database of the Tak...

Construct Validation of the Health Literacy Questionnaire (HLQ) in Shahrekord Cohort Study, Iran

Background: Health literacy promotion is considered to be an important goal in the healthcare strategic planning of every country. The present study aimed to evaluate the validity and reliability of the health literacy questionnaire (HLQ) in the participants of Shahrekord cohort study, Iran. Methods: This cross-sectional study was conducted on 400 respondents who were selected via systematic,...

A Bayesian Networks Approach to Reliability Analysis of a Launch Vehicle Liquid Propellant Engine

This paper presents an extension of Bayesian networks (BN) applied to reliability analysis of an open gas generator cycle liquid propellant engine (OGLE) of launch vehicles. There are several methods for system reliability analysis, such as RBD, FTA, FMEA, and Markov chains. But for complex systems such as LV, they are not all efficiently applicable due to failure dependencies between compo...

Estimation of Products Final Price Using Bayesian Analysis Generalized Poisson Model and Artificial Neural Networks

Estimating the final price of products is of great importance. For manufacturing companies, proposing a final price is only possible after the design process is over. These companies propose an approximate initial price of the required products to the customers, which requires some time and money. Here, using the existing data of already designed transformers and utilizing the Bayesian anal...

Risk Analysis of Operating Room Using the Fuzzy Bayesian Network Model

To enhance patient safety, we need effective methods for risk management. This work aims to propose an integrated approach to risk management for a hospital system. To improve patient safety, we should develop flexible methods in which different aspects of risk and types of information are taken into consideration. This paper proposes a fuzzy Bayesian network to model and analyze risk in the op...

Publication date: 2006